




A Interpolation

Neural Information Processing Systems

We now show why this tells us to pick the all-ones vector for SM kernels (Corollary 4); then, by Lemma 1, the proof is complete. With this reduction in place, we move on to consider the means and lengthscales of our kernel, and show that the relevant quantity is bounded by a constant $C$ for all $\xi$, as proven below.

C.1 Proof for the Matrix Case

First, we introduce the matrix version of the ridge leverage function, first introduced in [AM15]. Definition 3. For a matrix $A \in \mathbb{R}^{n \times d}$ with rows $a_{1}, \dotsc, a_{n}$, the $\varepsilon$-ridge leverage score of the $i$-th row is $\tau_{i}^{\varepsilon}(A) = a_{i}^{\mathsf{T}} (A^{\mathsf{T}} A + \varepsilon I)^{-1} a_{i}$. We then move on to the theorem we want to prove, Theorem 5, and bound the two terms separately, starting with the latter; by Markov's inequality, the desired bound on the sketched matrix $SA$ then holds with high probability.

C.2 Proof for the Operator Case

We start with preliminary definitions for randomized operator analysis.
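Definition 3 above is reconstructed from the standard $\varepsilon$-ridge leverage scores of [AM15]; a minimal NumPy sketch of that formula follows. The function name and example data are illustrative, not from the source.

```python
import numpy as np

def ridge_leverage_scores(A, eps):
    """eps-ridge leverage score of each row a_i of A:
    tau_i = a_i^T (A^T A + eps I)^{-1} a_i  (Alaoui & Mahoney, 2015)."""
    n, d = A.shape
    G = A.T @ A + eps * np.eye(d)        # regularized Gram matrix
    X = np.linalg.solve(G, A.T)          # solve G X = A^T instead of inverting G
    return np.einsum("ij,ji->i", A, X)   # tau_i = a_i^T X[:, i]

# Each score lies in (0, 1), and their sum equals the effective dimension
# tr(A (A^T A + eps I)^{-1} A^T).
rng = np.random.default_rng(0)
A = rng.standard_normal((100, 5))
tau = ridge_leverage_scores(A, eps=1.0)
print(tau.sum())  # effective dimension, at most 5 here
```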



Max-Linear Regression by Scalable and Guaranteed Convex Programming

Kim, Seonho, Bahmani, Sohail, Lee, Kiryung

arXiv.org Machine Learning

We consider the multivariate max-linear regression problem where the model parameters $\boldsymbol{\beta}_{1},\dotsc,\boldsymbol{\beta}_{k}\in\mathbb{R}^{p}$ need to be estimated from $n$ independent samples of the (noisy) observations $y = \max_{1\leq j \leq k} \boldsymbol{\beta}_{j}^{\mathsf{T}} \boldsymbol{x} + \mathrm{noise}$. The max-linear model vastly generalizes the conventional linear model, and it can approximate any convex function to arbitrary accuracy when the number of linear models $k$ is large enough. However, the inherent nonlinearity of the max-linear model renders the estimation of the regression parameters computationally challenging. In particular, no estimator based on convex programming is known in the literature. We formulate and analyze a scalable convex program as the estimator for the max-linear regression problem. Under the standard Gaussian observation setting, we present a non-asymptotic performance guarantee showing that the convex program recovers the parameters with high probability. When the $k$ linear components are equally likely to achieve the maximum, our result shows that a sufficient number of observations scales as $k^{2}p$ up to a logarithmic factor. This significantly improves on the analogous prior result based on alternating minimization (Ghosh et al., 2019). Finally, through a set of Monte Carlo simulations, we illustrate that our theoretical result is consistent with empirical behavior, and the convex estimator for max-linear regression is competitive in practice with the alternating minimization algorithm.
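As a concrete illustration of the observation model (not of the paper's convex estimator, which the abstract does not spell out), here is a minimal NumPy sketch of sampling from the max-linear model under the standard Gaussian setting; all sizes, names, and the noise level are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
k, p, n = 3, 5, 1000

beta = rng.standard_normal((k, p))    # true parameters beta_1, ..., beta_k
X = rng.standard_normal((n, p))       # covariates x_i ~ N(0, I_p)
noise = 0.1 * rng.standard_normal(n)
y = (X @ beta.T).max(axis=1) + noise  # y_i = max_j beta_j^T x_i + noise_i

# Fraction of samples on which each linear piece attains the maximum;
# the k^2 p sample-complexity result assumes these are roughly equal.
argmax = (X @ beta.T).argmax(axis=1)
print(np.bincount(argmax, minlength=k) / n)
```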


Learning high-dimensional probability distributions using tree tensor networks

Grelier, Erwan, Nouy, Anthony, Lebrun, Régis

arXiv.org Machine Learning

We consider the problem of estimating a high-dimensional probability distribution using model classes of functions in tree-based tensor formats, a particular case of tensor networks associated with a dimension partition tree. The distribution is assumed to admit a density with respect to a product measure, possibly discrete, to handle the case of discrete random variables. After discussing the representation of classical model classes in tree-based tensor formats, we present learning algorithms based on empirical risk minimization using an $L^2$ contrast. These algorithms exploit the multilinear parametrization of the formats to recast the nonlinear minimization problem into a sequence of empirical risk minimization problems with linear models. A suitable parametrization of the tensor in tree-based tensor format yields a linear model with orthogonal bases, so that each problem admits an explicit expression for the solution and for cross-validation risk estimates. These risk estimates enable model selection, for instance when exploiting sparsity in the coefficients of the representation. A strategy for adapting the tensor format (dimension tree and tree-based ranks) is provided, which makes it possible to discover and exploit specific structures of high-dimensional probability distributions, such as independence or conditional independence. We illustrate the performance of the proposed algorithms on the approximation of classical probabilistic models (such as Gaussian distributions, graphical models, and Markov chains).
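To make the "sequence of linear problems" idea concrete, here is a minimal sketch under simplifying assumptions of ours (two variables on $[0,1]^2$, a shifted-Legendre orthonormal basis, and rank truncation by SVD as the simplest tree-based format): empirical risk minimization with the $L^2$ contrast $\|f\|^2 - \frac{2}{n}\sum_i f(x_i)$ has an explicit solution in an orthonormal basis. This illustrates the principle, not the authors' algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(size=(2000, 2))  # samples of (X1, X2) on [0, 1]^2

def legendre_features(t, m):
    """First m orthonormal shifted Legendre polynomials on [0, 1]."""
    P = np.polynomial.legendre.legvander(2 * t - 1, m - 1)
    return P * np.sqrt(2 * np.arange(m) + 1)  # normalize to unit L2 norm

m = 8
F1, F2 = legendre_features(x[:, 0], m), legendre_features(x[:, 1], m)

# For f = phi1^T C phi2, the empirical L2 contrast ||f||^2 - (2/n) sum_i f(x_i)
# is quadratic in C; with orthonormal bases its minimizer is explicit:
C = (F1[:, :, None] * F2[:, None, :]).mean(axis=0)  # C_ab = mean phi_a(x1) phi_b(x2)

# Rank-r truncation of C plays the role of the low-rank tensor-format model class.
U, s, Vt = np.linalg.svd(C)
r = 3
C_r = U[:, :r] @ np.diag(s[:r]) @ Vt[:r, :]

def density(t1, t2, C):
    f1 = legendre_features(np.atleast_1d(t1), m)
    f2 = legendre_features(np.atleast_1d(t2), m)
    return (f1 @ C @ f2.T)[0, 0]

print(density(0.5, 0.5, C_r))  # estimated density at (0.5, 0.5)
```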